Robust Ranking Explanations
Robust explanations of machine learning models are critical to establish
human trust in the models. Due to limited cognition capability, most humans can
only interpret the top few salient features. It is critical to make top salient
features robust to adversarial attacks, especially those against the more
vulnerable gradient-based explanations. Existing defense measures robustness
using $\ell_p$-norms, which have weaker protection power. We define explanation
thickness for measuring salient features ranking stability, and derive
tractable surrogate bounds of the thickness to design the R2ET
algorithm to efficiently maximize the thickness and anchor top salient
features. Theoretically, we prove a connection between R2ET and adversarial
training. Experiments with a wide spectrum of network architectures and data
modalities, including brain networks, demonstrate that R2ET attains higher
explanation robustness under stealthy attacks while retaining accuracy.
Comment: Accepted to IMLH (Interpretable ML in Healthcare) workshop at ICML
2023. arXiv admin note: substantial text overlap with arXiv:2212.1410
Erasing-based lossless compression method for streaming floating-point time series
There are a prohibitively large number of floating-point time series data
generated at an unprecedentedly high rate. An efficient, compact and lossless
compression for time series data is of great importance for a wide range of
scenarios. Most existing lossless floating-point compression methods are based
on the XOR operation, but they do not fully exploit the trailing zeros, which
usually results in an unsatisfactory compression ratio. This paper proposes an
Erasing-based Lossless Floating-point compression algorithm, i.e., Elf. The
main idea of Elf is to erase the last few bits (i.e., set them to zero) of
floating-point values, so the XORed values are supposed to contain many
trailing zeros. The challenges of the erasing-based method are three-fold.
First, how to quickly determine the erased bits? Second, how to losslessly
recover the original data from the erased ones? Third, how to compactly encode
the erased data? Through rigorous mathematical analysis, Elf can directly
determine the erased bits and restore the original values without losing any
precision. To further improve the compression ratio, we propose a novel
encoding strategy for the XORed values with many trailing zeros. Furthermore,
observing the values in a time series usually have similar significand counts,
we propose an upgraded version of Elf named Elf+ by optimizing the significand
count encoding strategy, which improves the compression ratio and reduces the
running time further. Both Elf and Elf+ work in a streaming fashion. They take
only O(N) (where N is the length of a time series) in time and O(1) in space,
and achieve a notable compression ratio with a theoretical guarantee. Extensive
experiments using 22 datasets show the powerful performance of Elf and Elf+
compared with 9 advanced competitors for both double-precision and
single-precision floating-point values.
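The erasing idea described in this abstract can be illustrated with a small Python sketch. It is not the Elf implementation: the abstract says Elf determines the erased bits directly through mathematical analysis, whereas this sketch uses a brute-force search as a stand-in, and it restores values by rounding to a known number of decimal places. The function names (`erase`, `max_erasable`, `trailing_zeros`) and the sample values are assumptions made for illustration only.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF

def to_bits(x: float) -> int:
    """Reinterpret a double as its 64-bit IEEE 754 pattern."""
    return struct.unpack(">Q", struct.pack(">d", x))[0]

def from_bits(b: int) -> float:
    """Reinterpret a 64-bit pattern as a double."""
    return struct.unpack(">d", struct.pack(">Q", b))[0]

def erase(x: float, k: int) -> float:
    """Set the k lowest bits of x to zero."""
    return from_bits(to_bits(x) & ~((1 << k) - 1) & MASK64)

def max_erasable(x: float, decimals: int) -> int:
    """Brute-force stand-in (assumption): largest k such that rounding
    the erased value to `decimals` places still restores x exactly."""
    best = 0
    for k in range(53):  # only mantissa bits are considered
        if round(erase(x, k), decimals) == x:
            best = k
        else:
            break
    return best

def trailing_zeros(b: int) -> int:
    """Number of trailing zero bits in a 64-bit pattern."""
    return 64 if b == 0 else (b & -b).bit_length() - 1

# Hypothetical series whose values all have two decimal places.
values = [3.17, 3.25, 3.18]
ks = [max_erasable(v, 2) for v in values]
erased = [erase(v, k) for v, k in zip(values, ks)]
restored = [round(e, 2) for e in erased]

for prev, cur in zip(erased, erased[1:]):
    xored = to_bits(prev) ^ to_bits(cur)
    print(f"{xored:064b} trailing zeros: {trailing_zeros(xored)}")
```

Because each erased value has its low bits set to zero, the XOR of two consecutive erased values has at least as many trailing zeros as the smaller of the two erased-bit counts, which is the property the encoding step relies on.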
Joint Modeling of Linkage and Association Using Affected Sib-pair Data
There has been a growing interest in developing strategies for identifying single-nucleotide polymorphisms (SNPs) that explain a linkage signal by joint modeling of linkage and association. We compare several existing methods and propose a new method called the homozygote sharing transmission-disequilibrium test (HSTDT) to detect linkage and association or to identify SNPs explaining the linkage signal on chromosome 6 for rheumatoid arthritis using 100 replicates of the Genetic Analysis Workshop (GAW) 15 simulated affected sib-pair data. Existing methods considered included the family-based tests of association implemented in FBAT, a transmission-disequilibrium test, a conditional logistic regression approach, a likelihood-based approach implemented in LAMP, and the homozygote sharing test (HST). We compared the type I error rates and power for tests classified into three categories according to their null hypotheses: 1) no association in the presence of linkage (i.e., a SNP explains none of the linkage evidence), 2) no linkage adjusting for the association (i.e., a SNP explains all linkage evidence), and 3) no linkage and no association. For testing association in the presence of linkage, we found similar power among all tests except for the homozygote sharing test that had lower power. When testing linkage adjusting for association, similar power was observed between LAMP and HST, but lower power for the conditional logistic regression method. When testing linkage or association, the conditional logistic regression method was more powerful than FBAT